I’m a data scientist in industry, with a background in social science research. I am deeply interested in programming, sociology, the data science industry (and making it better), and education.
Contact me on twitter if you…
To see more about what kinds of data science I do, check out my projects on this page, or my Github profile.
I’ll be speaking about when to try using animation in your data viz, and how to do it, in July at R-Ladies/Data Viz Society. Check back soon for more details!
Past Appearances:
Listen to my conversation with Eric Kavanagh on Inside Analytics
See my slides from REV 2, 5/23/19. Video coming soon!
See my keynote from satRdays Chicago, 4/27/19
To see my satRday slides close up, head over to my Github.
R packages for team collaboration: from ODSC East 2018, EARL Roadshow Seattle 2018, and Metis Speaker Series. Get the slides and supporting materials on Github.
R package construction from the Women’s Package Workshop in February 2019
If you have questions or need help producing your own packages or completing any of my tutorials, hit me up on twitter!
I wrote a silly R package called radlibs that allows you to make your own madlibs. Then I wrote a version in Python. Then I added them to CRAN and pypi. Data science doesn’t always have to be serious. Use install.packages("radlibs") or pip install radlibs to get these packages. Issues and feedback welcome!
I recently co-taught a daylong course for a group of 30 women/gender nonbinary students about how to write R packages, and we had a really good time! I analyzed our pre- and post-surveys in a notebook to check how effective the day was for students.
This project is a Kaggle kernel, in which I walked the reader through the process of cleaning and modeling the data from a real estate prices dataset, using linear modeling, random forests, and gradient boosting (xgboost). My most popular kernel to date! This one also produced respectable competition results, and was chosen for special recognition by the Kaggle admins. (I won a mug!)
Update: Read the interview I did regarding this project (and the other fabulous winners)! http://blog.kaggle.com/2017/03/29/predicting-house-prices-playground-competition-winning-kernels
Key Skills: machine learning, data cleaning
I led a team working on the Chicago Lobbying project, which produced some great output, including this visualization of lobbying and aldermen in Chicago. The project is continuing and building out new functionality. I personally cleaned some of the underlying data, but my biggest contribution was organizing, planning, and leadership. Additional results: https://data.world/lilianhj/chicago-lobbyists
Update: Check out a case study by the fine folks at data.world discussing the work that went into this project: https://medium.com/@sharonbrener/dbf30aeee70b
Among the public datasets available on Kaggle is this one, describing the crimes that have occurred in Austin, TX over a couple of years. This project cleans the data, does some exploratory analysis, and maps various kinds of crime by district.
Key Skills: data cleaning, GIS
stephanie@stephaniekirmer.com
Kaggle | Twitter | Github | Linkedin
See what I’m reading on Pocket: http://getpocket.com/@data_stephanie
This site is built in RMarkdown and a bit of JavaScript.